Skip to content

Conversation

@sbraconnier
Copy link
Contributor

@sbraconnier sbraconnier commented May 17, 2025

Fix Intermittent ConcurrentModificationException using getMergedConnectorMessage (#92)

@jonbartels
Copy link
Contributor

@sbraconnier - I see what the change does and I see how it could prevent a concurrent modification exception.

Can you link to an issue or discussion or other info about why or when the issue happens?

@sbraconnier
Copy link
Contributor Author

sbraconnier commented May 21, 2025

@jonbartels Sorry I only added the reference in the title. I updated the PR description to link the original issue.

Copy link
Member

@tonygermano tonygermano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this may not actually fix the issue. I can't find any indication that this method would be called concurrently from two threads. The stack traces from the two nextgen issues don't have mirth versions on them, but they indicate that the problem occurs on one of these two lines (the line numbers shifted in mirth 4.4.0):

responseMap.putAll(connectorMessage.getResponseMap());
channelMap.putAll(connectorMessage.getChannelMap());

The map being modified concurrently is the argument to the putAll method. Since this method doesn't update that map, that means there is probably another thread that isn't trying to call this same method, but is instead updating the responseMap or channelMap for the same connectorMessage while this method is trying to read it.

@sbraconnier
Copy link
Contributor Author

Hmm you're right. I think it would be nice to let the synchronized block since the method is building the mergedConnectorMessage field and it will ensured that 2 threads won't initialized it concurrently. I'll take a look at the putAll calls.

@tonygermano
Copy link
Member

My guess without digging into it too much is that a destination queue thread is updating either the responseMap or the connectorMap while the post-processor is trying to build the mergedConnectorMessage.

It may be that just setting a couple retry attempts within this method if a ConcurrentModificationException is thrown would take care of it since this is pretty rare, and that wouldn't introduce the performance penalties associated with synchronization between threads.

@tonygermano tonygermano requested review from a team, NicoPiel, jonbartels, kayyagari, kpalang, mgaffigan and tonygermano and removed request for a team January 14, 2026 21:34
Copy link
Contributor

@mgaffigan mgaffigan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the ostrich algorithm is a thing, but this seems like a bad idea to try again. ConcurrentModificationException is a signal we're violating a collection's synchronization contract. It's being "nice" and telling us that so we can find where we went wrong.

If we don't want to synchronize, I'm sure there's a lock-free collection we could use, but I don't imagine that's required.

Why are there multiple threads involved in message processing? I thought each channel thread serially processed messages.

@tonygermano
Copy link
Member

I know the ostrich algorithm is a thing, but this seems like a bad idea to try again. ConcurrentModificationException is a signal we're violating a collection's synchronization contract. It's being "nice" and telling us that so we can find where we went wrong.

If we don't want to synchronize, I'm sure there's a lock-free collection we could use, but I don't imagine that's required.

Why are there multiple threads involved in message processing? I thought each channel thread serially processed messages.

@mgaffigan From reading the issue and the related issues from the nextgen repo, the problem only seems to manifest in high volume channels, and only on rare occasions. I've never seen it before. I have not tried to reproduce it, but I gave my best guess at what is happening in my previous comment. A destination queue thread and post-processor thread are the only scenario I can think of where one might be writing and the other reading the same data structure (one of the message maps, probably the responseMap, in this case) at the same time.

I'm open to alternative suggestions if we can pinpoint where the problem actually is. I gave the suggestion as a quick fix to an old problem that was likely to minimally impact people who were not experiencing the problem (most of us.)

@mgaffigan
Copy link
Contributor

mgaffigan commented Jan 15, 2026

@tonygermano
I admit I do not understand the issue. I do not like the solution as proposed - that might mean we need a more concise repro.

From my understanding, the design is:

  • Destinations have their own responseMap, avoiding the risk of multithreading despite running in parallel
  • The destination response maps are reduced after all destination chains ("Wait for previous destination") complete
  • Postprocessor runs after all destinations complete
  • Queueing does change the above

Of the four assertions above, I'm least confident in the fourth.

I would prefer a change which improves error messaging by failing fast when an assumption is broken.

@kpalang
Copy link
Contributor

kpalang commented Jan 15, 2026

I'd agree here that a minimal reproducible sample should be provided before moving on with this PR.

I suggest looking into dedicated load-testing tools. The first one I can think of is wrk, but there are definitely others.

@tonygermano
Copy link
Member

This is obviously a case of something being accessed concurrently that isn't expected to be accessed concurrently. For reference, the description for the HashMap class provides the following information, which I expect applies to how the putAll iterates over the argument's EntrySet.

The iterators returned by all of this class's "collection view methods" are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants